The architecture of carry chains in Field-Programmable Gate Arrays (FPGAs) is introduced in this paper. The propagation delay times of the rising and falling edges in the carry chains are calculated according to the architecture, and they are predicted to be unequal in most cases. Tests show that the propagation delay times measured in the EP3C120F484C8N FPGA from Altera are in line with this inference. The difference in propagation delay time results in different accuracies of the Time-to-Digital Converter (TDC). This phenomenon should be considered in the design of TDCs implemented in FPGAs in order to ensure better accuracy.
I. INTRODUCTION


Time-to-Digital Converters (TDCs) are widely used in scientific applications. High-resolution TDCs are indispensable in many physics experiments, such as time-of-flight (TOF) systems for target recoil-ion momentum spectroscopy at the Institute of Modern Physics (IMP), Chinese Academy of Sciences [1]. Two interpolation methods used in high-resolution applications, the vernier method and the tapped delay line (TDL) method, have been successfully implemented in field-programmable gate arrays (FPGAs) [2].

At present, TDCs that use the tapped delay line method in FPGAs and take advantage of the reduced delays provided by dedicated arithmetic (carry-chain) routing structures are an efficient and successful way to achieve low-dead-time measurements.

In this work, to study TDC implementation in FPGAs further, we analyzed the carry-chain routing structures in detail and found that the propagation speeds of rising and falling edges should differ from each other. To verify the prediction, the relationship between the propagation speeds and TDC accuracy was tested. The results show that the propagation speed affects the TDC accuracy, so the choice of edge matters in TDC design.


II. THE ARCHITECTURE OF THE CARRY CHAIN


The most common FPGA architecture consists of an array of logic blocks, I/O pads, and routing channels. A logic block is called a Configurable Logic Block (CLB) or a Logic Array Block (LAB), depending on the vendor. In general, it consists of a few logic cells (called ALM, LE, Slice, etc.) [3-5]. A typical cell has a 4-input lookup table (LUT), a full adder (FA), and a D-type flip-flop, as shown in Fig. 1. One tap of the carry chain is marked in the figure. The carry-in signal is a bit carried in from the next less significant stage, and the carry-out signal represents an overflow into the next digit of a multi-digit addition.


[Figure labels: logic cell, one tap of carry chain, carry in, carry out, clk.]

Fig. 1. Simplified example illustration of a logic cell.


It is possible to build a logic circuit from multiple full adders to add N-bit numbers. Each full adder takes as its carry-in signal the carry-out signal of the previous adder. This kind of adder is called a ripple-carry adder. A 4-bit ripple-carry adder is shown in Fig. 2. When an N-bit ripple-carry adder is implemented in an FPGA, N-1 taps of the carry chain can be obtained. The carry chain is usually used as the tapped delay line in TDC designs implemented in FPGAs.




Fig. 2. 4-bit ripple-carry adder. 
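
The ripple-carry structure described above can be sketched behaviorally in a few lines of Python. This is only a logic-level illustration, not a model of any FPGA primitive; the helper names (`full_adder`, `ripple_carry_add`, `to_bits`, `from_bits`) are hypothetical and chosen for this example. The list of per-stage carries makes the N-1 internal taps of an N-bit adder visible.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder: returns (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def ripple_carry_add(a_bits, b_bits, carry_in=0):
    """Add two equal-length bit lists (least significant bit first).

    Returns (sum_bits, carries), where carries[i] is the carry-out of stage i.
    """
    if len(a_bits) != len(b_bits):
        raise ValueError("operands must have the same width")
    sum_bits, carries = [], []
    carry = carry_in
    for a, b in zip(a_bits, b_bits):
        # The carry-out of each stage feeds the carry-in of the next stage.
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
        carries.append(carry)
    return sum_bits, carries


def to_bits(value, width):
    """Integer to bit list, least significant bit first."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """Bit list (least significant bit first) to integer."""
    return sum(bit << i for i, bit in enumerate(bits))


WIDTH = 4
sum_bits, carries = ripple_carry_add(to_bits(11, WIDTH), to_bits(6, WIDTH))
result = from_bits(sum_bits) + (carries[-1] << WIDTH)
internal_taps = carries[:-1]  # carries passed between stages: N-1 of them
print(result, internal_taps)
```
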
III. PROPAGATION TIMES OF RISING AND FALLING
EDGES IN THE CARRY CHAIN


As the dedicated arithmetic routing structure in FPGAs, a carry chain consists of a series of full adders. The propagation time along it is the sum of the propagation time in the full adders and the propagation time in the wires connecting them. Generally, the wires are short and contribute equally to the propagation times of rising and falling edges, so we consider only the propagation time in the full adder.

Fig. 3. A CMOS full adder schematic.


The propagation time can be calculated by analyzing a full adder circuit. As most FPGAs are manufactured using CMOS process technology, the typical CMOS full adder in Fig. 3 is analyzed in this paper. To make the adder work as a delay element, input A is set to 1 and input B is set to 0. $C_{out}$ is then logically equal to $C_{in}$, and the propagation time in the full adder is the time from $C_{in}$ to $C_{out}$. The circuit analysis follows the approach in Ref. [6]. Fig. 4 shows the circuit with the FET resistances and capacitances. For convenience, the circuit is divided into two stages, and the propagation times of Stages 1 and 2 are calculated separately.
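
As a quick logic-level check of the input setting above, the following Python sketch enumerates all (A, B) combinations and lists those for which the carry output simply follows the carry input. The function names are hypothetical; only the standard full-adder carry equation is assumed.

```python
from itertools import product


def carry_out(a, b, carry_in):
    """Carry output of a one-bit full adder (majority of the three inputs)."""
    return (a & b) | (a & carry_in) | (b & carry_in)


def passes_carry(a, b):
    """True if C_out equals C_in for both values of C_in."""
    return all(carry_out(a, b, c) == c for c in (0, 1))


# Input settings that turn the adder into a transparent carry stage.
transparent = [(a, b) for a, b in product((0, 1), repeat=2) if passes_carry(a, b)]
print(transparent)
```

The setting A = 1, B = 0 used in the analysis is one of the two combinations with A and B different; for A = B the carry output is fixed by A and B alone, so no edge would propagate.
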

Figure 5 shows the sub-circuits for the output transients of Stage 1. $R_n$ and $R_p$ are the parasitic resistances of the NMOS and PMOS transistors, respectively. $C_{out1}$ is the output capacitance of Stage 1, $C_n$ is the parasitic capacitance between the NMOS transistors, and $C_p$ is the parasitic capacitance between the PMOS transistors. The propagation time of the falling edge is calculated using the circuit in Fig. 5(a). The output voltage can be written as


$$V_{out1}(t) = V_{dd}\, e^{-t/\tau_{na}}. \tag{1}$$


The two discharge paths are shown as $i_{dis,1}$ and $i_{dis,2}$. According to the Elmore formula, the time constant of the main discharge path $i_{dis,1}$ is

$$\tau_{n1} = C_{out1}(2R_n). \tag{2}$$

Fig. 4. The circuit for calculating the propagation time.

Fig. 5. Discharge circuit (a) and charging circuit (b) for Stage 1.


Since input A is set to 1 and input B is set to 0, NMOS1 and PMOS1 remain in the conducting state. $C_n$ therefore does not discharge during this event, and the time constant of discharge path $i_{dis,2}$ is $\tau_{n2} = 0$. The total time constant is obtained by superposition:

$$\tau_{na} = \tau_{n1} + \tau_{n2} = C_{out1}(2R_n). \tag{3}$$


Then the propagation time of the falling edge in Stage 1, according to Ref. [6], is

$$t_{pf1} = \ln 2\,\tau_{na}. \tag{4}$$


The rise time $t_r$ is computed using the circuit in Fig. 5(b). The time constant of the main charging path $i_{ch,1}$ is

$$\tau_{p1} = C_{out1}(2R_p). \tag{5}$$


As PMOS1 remains in the conducting state, the time constant of charging path $i_{ch,2}$ is $\tau_{p2} = 0$. The total time constant is

$$\tau_{pa} = \tau_{p1} + \tau_{p2} = C_{out1}(2R_p). \tag{6}$$


Therefore, the propagation time of the rising edge in Stage 1 is

$$t_{pr1} = \ln 2\,\tau_{pa}. \tag{7}$$


Fig. 6. Discharge circuit (a) and charging circuit (b) for Stage 2.


Figure 6 shows the sub-circuits for the output transients of Stage 2. The time constant of discharge path $i_{dis}$ is

$$\tau_{nb} = C_{out2} R_n, \tag{8}$$

where $C_{out2}$ is the output capacitance of Stage 2. The propagation time of the falling edge in Stage 2 is

$$t_{pf2} = \ln 2\,\tau_{nb}. \tag{9}$$

The time constant of charging path $i_{ch}$ is

$$\tau_{pb} = C_{out2} R_p. \tag{10}$$

So, the propagation time of the rising edge in Stage 2 is

$$t_{pr2} = \ln 2\,\tau_{pb}. \tag{11}$$


Then, the propagation time of the rising edge from $C_{in}$ to $C_{out}$ is

$$t_{pr} = t_{pf1} + t_{pr2} = \ln 2\,(2C_{out1}R_n + C_{out2}R_p). \tag{12}$$

The propagation time of the falling edge from $C_{in}$ to $C_{out}$ is

$$t_{pf} = t_{pr1} + t_{pf2} = \ln 2\,(2C_{out1}R_p + C_{out2}R_n). \tag{13}$$
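
Subtracting Eq. (13) from Eq. (12) makes the asymmetry explicit. The following is only a rearrangement of the two expressions, with no additional assumptions:

$$t_{pr} - t_{pf} = \ln 2\,\left[2C_{out1}(R_n - R_p) + C_{out2}(R_p - R_n)\right] = \ln 2\,(R_p - R_n)(C_{out2} - 2C_{out1}).$$

The two edges therefore propagate in equal time only if $R_p = R_n$ or $C_{out2} = 2C_{out1}$. For $R_p > R_n$, the rising edge is the slower one when $C_{out2} > 2C_{out1}$ and the faster one when $C_{out2} < 2C_{out1}$.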


The values of $R_p$ and $R_n$ can be calculated according to Ref. [6]:

$$R_p = \frac{1}{k_p (W/L)_p (V_{dd} - |V_{Tp}|)}, \tag{14}$$

$$R_n = \frac{1}{k_n (W/L)_n (V_{dd} - |V_{Tn}|)}, \tag{15}$$
where $k_n$ and $k_p$ are the nFET and pFET process transconductances, respectively (typically $k_n/k_p = 2 \sim 3$); $(W/L)_p$ and $(W/L)_n$ are the width-to-length ratios of the p-channel and n-channel MOSFETs, respectively; and $|V_{Tp}|$ and $|V_{Tn}|$ are the threshold voltages of the p-channel and n-channel MOSFETs, respectively.

A large width-to-length ratio increases the circuit speed, but it also increases the circuit area. As a compromise between speed and area, the $R_p$ to $R_n$ ratio is typically

$$\frac{R_p}{R_n} \approx 1 \sim 3. \tag{16}$$

The smaller the $R_p/R_n$ ratio, the larger the area of the circuit.
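
To illustrate how the ratio in Eq. (16) follows from Eqs. (14) and (15), the sketch below evaluates both resistances for a few pFET widths. All numerical values (supply voltage, thresholds, transconductances, width-to-length ratios) are hypothetical and chosen only for illustration; they are not taken from the paper or from any specific process.

```python
def fet_resistance(k_process, w_over_l, v_dd, v_t_abs):
    """Equivalent FET resistance, 1 / (k (W/L) (Vdd - |Vt|)), as in Eqs. (14)-(15)."""
    return 1.0 / (k_process * w_over_l * (v_dd - v_t_abs))


# Hypothetical values, for illustration only.
V_DD = 1.2      # supply voltage, V
V_T = 0.4       # |V_Tn| = |V_Tp| assumed equal, V
K_N = 250e-6    # nFET process transconductance, A/V^2
K_P = 100e-6    # pFET process transconductance, A/V^2 (k_n / k_p = 2.5)
WL_N = 4.0      # nFET width-to-length ratio


def resistance_ratio(wl_p):
    """R_p / R_n for a given pFET width-to-length ratio."""
    r_n = fet_resistance(K_N, WL_N, V_DD, V_T)
    r_p = fet_resistance(K_P, wl_p, V_DD, V_T)
    return r_p / r_n


for wl_p in (4.0, 6.0, 10.0):
    print(f"(W/L)_p = {wl_p:4.1f}: R_p/R_n = {resistance_ratio(wl_p):.2f}")
```

With equal transistor sizes the ratio equals $k_n/k_p$ under these assumptions; bringing it down to 1 requires a pFET that is $k_n/k_p$ times wider, which is the area cost mentioned above.
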

Figure 7 shows $t_{pr}$ and $t_{pf}$ as functions of $C_{out2}$ at different $R_p/R_n$ ratios, assuming $R_n = 300\ \Omega$ and $C_{out1} = 50$ fF. A larger $R_p/R_n$ ratio results in a greater difference between $t_{pr}$ and $t_{pf}$. For $R_p \neq R_n$, $t_{pr}$ equals $t_{pf}$ only at $C_{out2} = 2C_{out1}$; elsewhere the two are not equal. The size of the difference depends on the process used to fabricate the FPGA.

Fig. 7. Propagation time for different $R_p$ to $R_n$ ratios and $C_{out2}$ values ($R_n = 300\ \Omega$, $C_{out1} = 50$ fF).
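
The trend described for Fig. 7 can be reproduced numerically from Eqs. (12) and (13). The sketch below uses the stated $R_n = 300\ \Omega$ and $C_{out1} = 50$ fF; the swept $C_{out2}$ values and the specific ratios 1, 2, and 3 are assumptions made for illustration, since the exact axis range of the figure is not given in the text.

```python
import math

R_N = 300.0       # ohm, as stated for Fig. 7
C_OUT1 = 50e-15   # farad, as stated for Fig. 7
LN2 = math.log(2)


def t_pr(c_out2, ratio):
    """Rising-edge propagation time from Eq. (12), in seconds."""
    r_p = ratio * R_N
    return LN2 * (2 * C_OUT1 * R_N + c_out2 * r_p)


def t_pf(c_out2, ratio):
    """Falling-edge propagation time from Eq. (13), in seconds."""
    r_p = ratio * R_N
    return LN2 * (2 * C_OUT1 * r_p + c_out2 * R_N)


for ratio in (1.0, 2.0, 3.0):
    for c_out2_ff in (50, 100, 150, 200):  # illustrative sweep, in fF
        c_out2 = c_out2_ff * 1e-15
        diff_ps = (t_pr(c_out2, ratio) - t_pf(c_out2, ratio)) * 1e12
        print(f"Rp/Rn = {ratio:.0f}, Cout2 = {c_out2_ff:3d} fF: "
              f"t_pr - t_pf = {diff_ps:+.1f} ps")
```
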


IV. MEASUREMENT OF THE PROPAGATION TIME OF 
RISING AND FALLING EDGES IN THE CARRY CHAIN 


There are at least two approaches to measuring the propagation time in the carry chain: the double registration approach [7] and the statistical approach [8]. Both are commonly used for digital calibration of TDCs implemented in FPGAs. The double registration approach measures only the average cell delay. When the bin widths differ and a bin-by-bin measurement is needed, the statistical approach is preferable [9], and we use it here.

The FPGA TDC was used to measure the propagation time in the carry chain. A simplified block diagram of the TDC implemented in the FPGA is shown in Fig. 8 [10]. It includes two crucial parts: the time delay lines (the carry chain) and the coarse time counters. The TDC is based on the counter and interpolation method; a detailed description can be found in Ref. [11].

Fig. 8. (Color online) Block diagram of the time-to-digital converter implemented in a single FPGA device.

The method of measurement is as follows [9]. After power 
up or system reset, the TDC input is fed with calibration hits. 
The timing of these hits should have no correlation with the 
clock signal driving the TDC, so the hits are generated from 
an independent oscillator. 

A differential non-linearity (DNL) histogram is booked in the FPGA internal memory. Once all hits are booked into the histogram, a sequence controller builds a lookup table (LUT) in the FPGA internal memory. The LUT is obtained by integrating the DNL histogram, so that it outputs the actual time of the center of the addressed bin.
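
The histogram-to-LUT step can be sketched in software as follows. This is a minimal illustration, not the FPGA sequence controller from Ref. [9]: it assumes the hits are uniformly distributed over one clock period and that the bins together span exactly that period, so each bin width is proportional to its hit count. The function names, the bin widths, and the 400 ps period are hypothetical.

```python
import random


def build_lut(histogram, clock_period_ps):
    """Convert a hit histogram into bin widths and bin-center times (ps)."""
    total_hits = sum(histogram)
    if total_hits == 0:
        raise ValueError("histogram is empty")
    # Uniform hits: the share of hits in a bin equals its share of the period.
    widths = [clock_period_ps * count / total_hits for count in histogram]
    centers, elapsed = [], 0.0
    for width in widths:
        centers.append(elapsed + width / 2.0)  # integrate, then take the bin center
        elapsed += width
    return widths, centers


def simulate_histogram(true_widths_ps, n_hits, seed=0):
    """Book uncorrelated (uniform) hits into bins with the given true widths."""
    rng = random.Random(seed)
    period = sum(true_widths_ps)
    edges, acc = [], 0.0
    for width in true_widths_ps:
        acc += width
        edges.append(acc)
    histogram = [0] * len(true_widths_ps)
    for _ in range(n_hits):
        t = rng.uniform(0.0, period)
        for i, edge in enumerate(edges):
            if t < edge:
                histogram[i] += 1
                break
        else:
            histogram[-1] += 1  # t landed exactly on the upper boundary
    return histogram


true_widths = [60.0, 120.0, 90.0, 130.0]  # hypothetical bin widths, ps
histogram = simulate_histogram(true_widths, 100_000)
widths, centers = build_lut(histogram, sum(true_widths))
print([round(w, 1) for w in widths])
print([round(c, 1) for c in centers])
```
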

The LUTs of the rising edge and falling edge TDCs are shown in Fig. 9. The slope of each LUT can be interpreted as the average bin width; in Fig. 9(a), the slope for the falling edge TDC is about half of that for the rising edge TDC. The bin width is the propagation time of the rising or falling edge through one tap of the carry chain. The propagation time of the falling edge is shorter than that of the rising edge, which is due to the difference between the N and P transistors in CMOS integrated circuits. The experiment was carried out on several FPGA series from Altera and Xilinx.


V. THE RELATIONSHIP BETWEEN PROPAGATION 
TIME AND TDC ACCURACY 


Intuitively, a shorter propagation delay time should result in higher accuracy in a TDC design. To check this, the rising edge TDC and the falling edge TDC were tested separately. The "rising edge TDC" is the TDC that uses the rising edge information to convert time to digital, and the "falling edge TDC" is the TDC that uses the falling edge information.

Fig. 9. (a) LUT of EP3C120F484C8N (Altera); (b) LUT of EP2S60F1020I4 (Altera); (c) LUT of XC3S500EFG320-4 (Xilinx); (d) LUT of XC5VFX70TFF1136-1 (Xilinx).




TABLE 1. RMS resolution and average bin width of the rising edge TDC and falling edge TDC

                      RMS resolution          Average bin width
Device                Rising     Falling      Rising     Falling
                      edge TDC   edge TDC     edge TDC   edge TDC
EP3C120F484C8N        95 ps      52 ps        169 ps     91 ps
EP2S60F1020I4         31 ps      29 ps        45 ps      44 ps
XC3S500EFG320-4       69 ps      62 ps        98 ps      96 ps
XC5VFX70TFF1136-1     28 ps      25 ps        57 ps      55 ps


The input to the TDC is a periodic pulse train generated by an external phase-locked loop. The TDC is driven by a crystal oscillator. The time between successive rising edges of the pulse train is measured by the rising edge TDC and by the falling edge TDC separately. Digital calibration is used in the measurement [9].

The test results are shown in Table 1. For the EP3C120F484C8N, the root mean square (RMS) resolutions of the rising edge TDC and falling edge TDC are 95 ps and 52 ps, respectively. The falling edge TDC has the shorter average propagation delay time, so according to the test the shorter propagation delay time brings better RMS resolution. The long propagation delay time of the rising edge TDC limits its resolution.
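
The comparison can be made explicit by tabulating the Table 1 values and computing the falling-to-rising ratios for each device. The sketch below only rearranges the measured numbers from the table; the dictionary layout and variable names are chosen for this example.

```python
# (rising edge TDC, falling edge TDC) values in ps, copied from Table 1.
TABLE_1 = {
    "EP3C120F484C8N":    {"resolution": (95, 52), "bin_width": (169, 91)},
    "EP2S60F1020I4":     {"resolution": (31, 29), "bin_width": (45, 44)},
    "XC3S500EFG320-4":   {"resolution": (69, 62), "bin_width": (98, 96)},
    "XC5VFX70TFF1136-1": {"resolution": (28, 25), "bin_width": (57, 55)},
}

ratios = {}
for device, row in TABLE_1.items():
    res_rising, res_falling = row["resolution"]
    bw_rising, bw_falling = row["bin_width"]
    ratios[device] = (bw_falling / bw_rising, res_falling / res_rising)
    print(f"{device:18s} bin width falling/rising = {ratios[device][0]:.2f}, "
          f"resolution falling/rising = {ratios[device][1]:.2f}")
```

In every row both ratios are below 1, which is the pattern the text describes: the edge with the shorter average bin width also gives the better RMS resolution.
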


VI. CONCLUSION 


The difference between the propagation times of rising and falling edges in carry chains is due to the process used to fabricate the FPGAs. Since SRAM-based FPGAs are fabricated using processes similar to CMOS, the phenomenon is common in SRAM-based FPGAs. Being aware of it helps in understanding FPGA TDCs better.


The test results show that this phenomenon has a clear impact on TDC accuracy. The effect is present in TDCs implemented in FPGAs using the tapped delay line method. Knowing this helps in taking full advantage of the reduced delays provided by dedicated arithmetic (carry-chain) routing structures.